STAR/mmCIF: An ontology for macromolecular structure

Authors

  • John D. Westbrook
  • Philip E. Bourne
Abstract

MOTIVATION: Ten years ago, crystallographers set out to develop a simple and consistent data representation for the exchange and archiving of data associated with the crystallographic experiment and the final structure. As this process evolved (and the data grew at near-exponential rates), it became clear that this representation should also facilitate the automated management of the data and, with the aid of additional software for verification and validation, provide improved consistency and accuracy and hence improved scientific inquiry. This realization led to a new Dictionary Definition Language (DDL) and an extensive dictionary, based on this DDL, for describing macromolecular structure. In broad terms this could be considered an ontology. An important feature in the development of the ontology was its endorsement, ongoing maintenance, and support by the International Union of Crystallography (IUCr). While the description of macromolecular structure and the X-ray crystallographic experiment used to derive it represent explicit data, the ontology is extensible and applicable to other, less well-characterized data domains.

RESULTS: Details of the DDL, the dictionaries that have been developed, and software for reading and using this ontology are presented.

AVAILABILITY: Extensive documentation, software tools, and the DDL and dictionaries are available from http://ndbserver.rutgers.edu/mmcif and associated mirror sites.

CONTACT: Bourne: [email protected]; Westbrook: [email protected]


Related resources

Code Generation through Annotation of Macromolecular Structure Data

The maintenance of software which uses a rapidly evolving data annotation scheme is time consuming and expensive. At the same time without current software the annotation scheme itself becomes limited and is less likely to be widely adopted. A solution to this problem has been developed for the macromolecular Crystallographic Information File (mmCIF) annotation scheme. The approach could be gen...


The Macromolecular Crystallographic Information File (mmCIF)

Introduction The Protein Data Bank (PDB) format provides a standard representation for macromolecular structure data derived from X-ray diffraction and NMR studies. This representation has served the community well since its inception in the 1970's (Bernstein et al. 1) and a large amount of software that uses this representation has been written. However, it is widely recognized that the curren...


3.6. Classification and Use of Macromolecular Data

The sole data item in the category ENTRY, _entry.id, is a label that identifies the current data block. This label is used as the formal key in several categories that record information that is relevant to the entire data block (e.g. _cell.entry_id, _geom.entry_id), so care should be taken to select a label that is informative and unique. Data items in the ENTRY_LINK category record the relati...


Deposition of structure factors at the Protein Data Bank.

The Protein Data Bank (PDB) has long made available the experimental data which were used to determine the three-dimensional structures in the database. In recent years more and more depositors and users of the PDB have come to appreciate the importance of reliable access to such fundamental data. The d...


Protein Data Bank Japan (PDBj): maintaining a structural data archive and resource description framework format

The Protein Data Bank Japan (PDBj, http://pdbj.org) is a member of the worldwide Protein Data Bank (wwPDB) and accepts and processes the deposited data of experimentally determined macromolecular structures. While maintaining the archive in collaboration with other wwPDB partners, PDBj also provides a wide range of services and tools for analyzing structures and functions of proteins, which are...



Journal:
  • Bioinformatics

Volume 16, Issue 2

Pages: -

Publication date: 2000